XORing Elephants: Novel Erasure Codes for Big Data
Authors
Abstract
Distributed storage systems for large clusters typically use replication to provide reliability. Recently, erasure codes have been used to reduce the large storage overhead of three-replicated systems. Reed-Solomon codes are the standard design choice and their high repair cost is often considered an unavoidable price to pay for high storage efficiency and high reliability. This paper shows how to overcome this limitation. We present a novel family of erasure codes that are efficiently repairable and offer higher reliability compared to Reed-Solomon codes. We show analytically that our codes are optimal on a recently identified tradeoff between locality and minimum distance. We implement our new codes in Hadoop HDFS and compare to a currently deployed HDFS module that uses Reed-Solomon codes. Our modified HDFS implementation shows a reduction of approximately 2× on the repair disk I/O and repair network traffic. The disadvantage of the new coding scheme is that it requires 14% more storage compared to Reed-Solomon codes, an overhead shown to be information-theoretically optimal to obtain locality. Because the new codes repair failures faster, this provides higher reliability, which is orders of magnitude higher compared to replication.
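The 2× repair saving and the 14% storage overhead quoted in the abstract can be illustrated with a small sketch. The parameters are assumptions chosen for illustration and are not stated in the abstract: 10 data blocks, 4 Reed-Solomon parities, and 2 additional stored local parities, each covering a group of 5 data blocks. The local parity here is a plain XOR, which simplifies whatever coefficients a real code would use, and the names `xor_blocks`, `repair_from_group`, and `extra_storage` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def repair_from_group(group_with_gap, local_parity):
    """Rebuild the single missing block (marked None) of one local group.

    Only the surviving blocks of the group and its local parity are read,
    not all data blocks of the stripe.
    """
    survivors = [b for b in group_with_gap if b is not None]
    return xor_blocks(survivors + [local_parity])

def extra_storage(k, global_parities, local_parities):
    """Fractional extra storage of the locally repairable layout
    relative to Reed-Solomon with the same k and global parities."""
    rs_blocks = k + global_parities
    lrc_blocks = rs_blocks + local_parities
    return lrc_blocks / rs_blocks - 1

# Assumed toy data: one local group of 5 data blocks, 4 bytes each.
group = [bytes([i] * 4) for i in range(1, 6)]
local_parity = xor_blocks(group)

# Lose block 2 and repair it from the 4 survivors plus the local parity.
damaged = list(group)
damaged[2] = None
recovered = repair_from_group(damaged, local_parity)

# Blocks read per single-block repair under the assumed parameters.
rs_reads = 10    # Reed-Solomon repair reads k = 10 blocks
local_reads = 5  # 4 surviving group members + 1 local parity

print(recovered == group[2])               # True
print(rs_reads / local_reads)              # 2.0
print(round(extra_storage(10, 4, 2), 3))   # 0.143
```

Under these assumed parameters a single-block repair reads 5 blocks instead of 10, and the layout stores 16 blocks instead of 14, which matches the roughly 2× and 14% figures in the abstract.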
Similar resources
A Non-MDS Erasure Code Scheme for Storage Applications
This paper investigates the use of redundancy and self repairing against node failures in distributed storage systems using a novel non-MDS erasure code. In replication method, access to one replication node is adequate to reconstruct a lost node, while in MDS erasure coded systems which are optimal in terms of redundancy-reliability tradeoff, a single node failure is repaired after recovering the ...
The CORE Storage Primitive: Cross-Object Redundancy for Efficient Data Repair & Access in Erasure Coded Storage
Erasure codes are an integral part of many distributed storage systems aimed at Big Data, since they provide high fault-tolerance for low overheads. However, traditional erasure codes are inefficient on reading stored data in degraded environments (when nodes might be unavailable), and on replenishing lost data (vital for long term resilience). Consequently, novel codes optimized to cope with d...
Verification of Parity Data in Large Scale Storage Systems
Highly available storage uses replication and other redundant storage to recover from a component failure. If parity data calculated from an erasure correcting code is not updated or becomes otherwise corrupted, recovery from a failure does not recover the correct data but mostly garbled data. This paper presents an algebraic signature scheme that can detect parity discrepancies for parity calc...
Relieving Both Storage and Recovery Burdens in Big Data Clusters with R-STAIR Codes
Enterprise storage clusters increasingly adopt erasure coding to protect stored data against transient and permanent failures. Existing erasure code designs not only introduce extra parity information in a storage-inefficient manner, but also consume substantial cross-rack recovery bandwidth. To relieve both storage and recovery burdens of erasure coding, we adapt our previously proposed STAIR ...
Rateless Codes and Big Downloads
This paper presents a novel algorithm for downloading big files from multiple sources in peer-to-peer networks. The algorithm is simple, but offers several compelling properties. It ensures low handshaking overhead between peers that download files (or parts of files) from each other. It is computationally efficient, with cost linear in the amount of data transferred. Most importantly, when no...
Journal title:
- PVLDB
Volume 6
Publication date: 2013